
@yamahata
Contributor

The TDX module provides a timer service for L1. When L1 writes the deadline value with TDG.VP.WR(TSC DEADLINE), tdg.vp.enter() exits with a preemption timer exit when the deadline expires. (It does not inject a timer interrupt into the L2 guest.)

Update the tdx_vp_context shared between the L1 kernel and L1 userspace so that openvmm can use the TDX timer service.
Unless userspace uses it, the L1 OHCL kernel behaves as before.

@yamahata yamahata force-pushed the ohcl-tdx-timer-service-2025-11-13 branch 4 times, most recently from 1e0e3f7 to 4b8ffbb Compare November 20, 2025 09:23
@yamahata yamahata changed the title Work-In-Progress: Ohcl tdx timer service support Ohcl tdx timer service support Nov 20, 2025
@yamahata
Contributor Author

This kernel now successfully boots an L2 Linux kernel, so I removed the work-in-progress marker.

@yamahata yamahata force-pushed the ohcl-tdx-timer-service-2025-11-13 branch 3 times, most recently from a6df45b to 8cce60e Compare November 20, 2025 19:33
TD partitioning provides a timer service for the L1 (VTL2) guest to set
a preemption timer for L2 (VTL0) vCPUs.

Add members for a new timer service to the tdx_vp_context struct for
the L1 (VTL2) userspace to pass a timeout value down to the L1 (VTL2)
kernel.

Signed-off-by: Isaku Yamahata <[email protected]>
Refactor the __tdcall() call site into a dedicated wrapper for the
TDG.VP.WR() operation.  This allows additional TDG.VP.WR() calls to be
added cleanly, without repeated open-coding.

No functional change intended.

Signed-off-by: Isaku Yamahata <[email protected]>
*
* TDX TDVPS deadline:
* 0: immediate inject timer interrupt.
* -1: disarmed.
Member

Is this disarmed value present in the spec, or is it just an effect of setting an all-Fs TSC value?

Contributor Author

It's in the public spec,
Intel® Trust Domain Extensions (Intel® TDX) Module
TD Partitioning Architecture Specification
354807-005US
September 2025

23.13.2. L2 VM TSC Deadline Support

Setting TSC_DEADLINE to -1 disables its operation.

Contributor Author

I'll add this reference on the next update.

};

/*
* The L1 VMM needs to tell wake up time from HLT emulation because The host
Member

nit: capitalization here

Contributor Author

OK, will fix. Do you mean "The" => "the" after "because"? If not, please point out concretely which word to capitalize.

}
raw_local_irq_enable();
} else {
enum TDX_HALT_TIMER armed;
Member

I think I'd want some other reviewers to chime in on how they want to manage this TDX-specific code here.

Contributor Author

As this change would be intrusive, review from someone else would help.

struct mshv_vtl_per_cpu *per_cpu = this_cpu_ptr(&mshv_vtl_per_cpu);
u64 vm_idx = TDG_VP_ENTRY_VM_IDX(context->entry_rcx);

if (is_tdx_vm_idx_valid(vm_idx))
Member

we use the prev value here because there wasn't an update call on this run? This handles the case when the timer was disarmed or disabled by the guest (because we set a large value of 0xFFs), is that right?

Contributor Author

we use the prev value here because there wasn't an update call on this run?

Yes.

This handles the case when the timer was disarmed or disabled by the guest (because we set a large value of 0xFFs), is that right?

Yes. Several cases are covered. The scenario is:

  • The L2 (VTL0) guest updates its timer => userspace openvmm sets the deadline with update = 1 and runs the L2 vCPU. This can arm or disarm the timer depending on the value.
  • Optional: the kernel may run the L2 vCPU and return to L1 (VTL2) before the timer expires. In that case update = 0, and the kernel remembers the deadline as the previous value.
  • In the L1 kernel, go to HLT emulation. In this path, look up the timer expiry value: the one in the context if update = 1, otherwise the remembered previous value.
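
A minimal userspace sketch of the selection described in the last bullet. The struct, field, and function names (`timer_request`, `update`, `deadline`, `hlt_effective_deadline`) are hypothetical and chosen for illustration; they are not the identifiers used in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* -1 (all Fs) means "disarmed", per the spec section quoted earlier. */
#define TSC_DEADLINE_DISARMED UINT64_MAX

/* Hypothetical view of what userspace passes down for one run. */
struct timer_request {
	bool update;        /* true if userspace supplied a new deadline */
	uint64_t deadline;  /* only meaningful when update is true */
};

/*
 * Pick the deadline the HLT emulation path should honor: the fresh
 * value if userspace updated it on this run, otherwise the value
 * remembered from the previous write.
 */
static uint64_t hlt_effective_deadline(const struct timer_request *req,
				       uint64_t prev_deadline)
{
	return req->update ? req->deadline : prev_deadline;
}
```

If the result equals `TSC_DEADLINE_DISARMED`, no wakeup needs to be scheduled for the HLT.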

Program the TD partitioning TSC deadline timer service for L2 (VTL0) vCPUs
when the L1 (VTL2) userspace requests it.  The TDX module then sets a
preemption timer for the L2 vCPU.  If the timer expires, the L2 (VTL0) vCPU
exits with a VMX preemption timer exit reason.  The mshv_vtl driver then
exits to userspace, which is notified of the exit.

The TDX module does not clear the TDVPS deadline on a preemption timer exit.
Disarm the TSC deadline explicitly on the preemption timer exit.  Otherwise
the following TDG.VP.ENTER() exits immediately without executing the L2
guest.
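
To illustrate why the explicit disarm matters, here is a toy userspace model, not the driver code. `model_vp`, `model_vp_enter()`, and `run_vcpu()` are hypothetical names, and the "module" is reduced to a single stored deadline compared against a caller-supplied TSC value.

```c
#include <stdint.h>

#define TSC_DEADLINE_DISARMED UINT64_MAX

enum exit_reason { EXIT_OTHER, EXIT_PREEMPTION_TIMER };

/* Toy stand-in for the per-vCPU deadline kept by the TDX module. */
struct model_vp {
	uint64_t tsc_deadline;
};

/* Model of entering L2: an expired, armed deadline forces a timer exit. */
static enum exit_reason model_vp_enter(const struct model_vp *vp, uint64_t now)
{
	if (vp->tsc_deadline != TSC_DEADLINE_DISARMED && now >= vp->tsc_deadline)
		return EXIT_PREEMPTION_TIMER;
	return EXIT_OTHER;
}

/*
 * Model of the run path: the stored deadline is not cleared by the
 * timer exit itself, so disarm it explicitly. Without this, every
 * later entry would exit immediately with the stale deadline.
 */
static enum exit_reason run_vcpu(struct model_vp *vp, uint64_t now)
{
	enum exit_reason reason = model_vp_enter(vp, now);

	if (reason == EXIT_PREEMPTION_TIMER)
		vp->tsc_deadline = TSC_DEADLINE_DISARMED;
	return reason;
}
```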

Signed-off-by: Isaku Yamahata <[email protected]>
Because the tdcall is slow, cache the previously written TSC deadline value
and skip the unnecessary tdg.vp.wr(TSC deadline) if the value hasn't
changed.  This also prepares for the HLT emulation case, which requires the
previously written TSC deadline value.
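
The caching idea can be sketched in userspace as follows. This is an illustration under assumptions: `tsc_deadline_cache`, `update_tsc_deadline()`, and the counting stub that stands in for the real tdg.vp.wr() call are all hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

#define TSC_DEADLINE_DISARMED UINT64_MAX

/* Hypothetical per-vCPU cache of the last deadline written. */
struct tsc_deadline_cache {
	bool valid;     /* false until the first write */
	uint64_t last;  /* last value handed to the (stubbed) tdcall */
};

/* Counts how many times the expensive call would have been issued. */
static unsigned int tdcall_count;

/* Stub standing in for the real tdg.vp.wr(TSC deadline) tdcall. */
static void stub_tdg_vp_wr_tsc_deadline(uint64_t deadline)
{
	(void)deadline;
	tdcall_count++;
}

/* Write the deadline only if it differs from the cached value. */
static bool update_tsc_deadline(struct tsc_deadline_cache *cache,
				uint64_t deadline)
{
	if (cache->valid && cache->last == deadline)
		return false;  /* unchanged: skip the slow tdcall */

	stub_tdg_vp_wr_tsc_deadline(deadline);
	cache->last = deadline;
	cache->valid = true;
	return true;
}
```

The cached `last` value doubles as the "previous value" that the HLT emulation path consults when userspace did not update the deadline.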

Signed-off-by: Isaku Yamahata <[email protected]>
The TDX timer service sets a preemption timer for the L2 (VTL0) vCPU.
tdg.vp.enter() exits with a preemption timer exit reason on timer expiry.
The HLT emulation path, where the L1 (VTL2) kernel issues
TDG.VP.VMCALL(HLT), needs an extra change because the host (L0) VMM doesn't
know the L2 deadline timer value.

When the L1 kernel issues TDG.VP.VMCALL(HLT), start a per-CPU hrtimer so
that a timer interrupt delivered to L1 wakes it up from the L0 HLT
emulation.  Cancel the hrtimer after returning from the L0 VMM.
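
Arming a relative timer from an absolute TSC deadline needs a conversion step. The commit message does not say how the patch does this, so the following is only a sketch under stated assumptions: the helper name `tsc_deadline_to_rel_ns()` is hypothetical, and the TSC frequency is assumed to be known in kHz and non-zero.

```c
#include <stdbool.h>
#include <stdint.h>

#define TSC_DEADLINE_DISARMED UINT64_MAX

/*
 * Convert an absolute TSC deadline into a relative timeout in
 * nanoseconds. Returns false when the deadline is disarmed, meaning
 * no wakeup timer is needed. An already-expired deadline yields 0.
 */
static bool tsc_deadline_to_rel_ns(uint64_t deadline, uint64_t now_tsc,
				   uint64_t tsc_khz, uint64_t *rel_ns)
{
	uint64_t delta;

	if (deadline == TSC_DEADLINE_DISARMED)
		return false;

	if (deadline <= now_tsc) {
		*rel_ns = 0;
		return true;
	}

	delta = deadline - now_tsc;
	/*
	 * ticks / kHz = milliseconds. Split into quotient and remainder
	 * so that delta * 1000000 is never computed directly. Extremely
	 * distant deadlines can still overflow the quotient term; that
	 * case is ignored in this sketch.
	 */
	*rel_ns = (delta / tsc_khz) * 1000000ULL +
		  (delta % tsc_khz) * 1000000ULL / tsc_khz;
	return true;
}
```

In a driver, a timeout computed this way would be passed to the hrtimer before the HLT call, and the timer cancelled once the call returns.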

Signed-off-by: Isaku Yamahata <[email protected]>
On the timer expiry path, the driver unconditionally issues
tdg.vp.wr(TSC deadline = disarm).  The following tdg.vp.enter() execution
path may overwrite it with tdg.vp.wr(new TSC deadline).  Delete the
redundant tdg.vp.wr() call as an optimization.

Signed-off-by: Isaku Yamahata <[email protected]>
Add an extension for the TDX timer service, so that the userspace can query
the feature before use.

Signed-off-by: Isaku Yamahata <[email protected]>
…akeup AP callback")

The commit df21bf3 ("arch/x86: Provide the CPU number in the wakeup AP
callback") changed the signature of struct apic::wakeup_secondary_cpu(),
but it did not update numachip_wakeup_secondary().  Update it to fix the
compile error.

arch/x86/kernel/apic/apic_numachip.c:228:43: error: initialization of 'int (*)(u32,  long unsigned int,  unsigned int)' {aka 'int (*)(unsigned int,  long unsigned int,  unsigned int)'} from incompatible pointer type 'int (*)(u32,  long unsigned int)' {aka 'int (*)(unsigned int,  long unsigned int)'} [-Wincompatible-pointer-types]
  228 |         .wakeup_secondary_cpu           = numachip_wakeup_secondary,
      |                                           ^~~~~~~~~~~~~~~~~~~~~~~~~

Fixes: df21bf3 ("arch/x86: Provide the CPU number in the wakeup AP callback")
Signed-off-by: Isaku Yamahata <[email protected]>
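
The shape of the fix can be sketched with a reduced model. The three-argument pointer type is taken from the compiler error quoted above; the parameter names, the `_model` suffixes, and the stub body are assumptions for illustration and do not reproduce the real `struct apic` or the numachip code.

```c
#include <stdint.h>

typedef uint32_t u32;

/* Reduced model of struct apic: only the callback relevant here. */
struct apic_model {
	int (*wakeup_secondary_cpu)(u32 apicid, unsigned long start_eip,
				    unsigned int cpu);
};

/* Records the CPU number so the example can be checked. */
static unsigned int last_cpu;

/*
 * The callback now takes the CPU number as a third argument, matching
 * the updated pointer type. The body here is a stub.
 */
static int numachip_wakeup_secondary_model(u32 phys_apicid,
					   unsigned long start_rip,
					   unsigned int cpu)
{
	(void)phys_apicid;
	(void)start_rip;
	last_cpu = cpu;
	return 0;
}

/* With the matching signature this initializer compiles cleanly. */
static const struct apic_model apic_numachip_model = {
	.wakeup_secondary_cpu = numachip_wakeup_secondary_model,
};
```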
@yamahata yamahata force-pushed the ohcl-tdx-timer-service-2025-11-13 branch from ac07f44 to 49cea0c Compare November 26, 2025 22:11

2 participants